Explanation: Typo tolerance in Afosto Instant Search

Before sorting documents, Instant Search applies typo rules to collect the documents that contain words similar to the words in the search query.

Instant Search checks whether words match using a prefix variant of the Levenshtein algorithm. A document word is accepted when it is the same as the search term or starts with it, allowing for a limited number of typos.

The algorithm counts the smallest number of single-letter changes needed to turn one word into another. Each of the following counts as one change:

  • Replacing one letter with another letter (e.g., kitten → sitten)
  • Adding a letter (e.g., sittin → sitting)
  • Removing a letter (e.g., Saturday → Satuday)
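
To make the counting concrete, here is a minimal sketch of the standard Levenshtein distance in Python. It is an illustration only, not Afosto's implementation; the function name `levenshtein` is made up for this example, and the comparison is case-sensitive by assumption.

```python
def levenshtein(source: str, target: str) -> int:
    """Return the minimum number of single-letter changes that turn source into target."""
    # previous[j] is the distance between the part of source handled so far and target[:j].
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        # Turning source[:i] into an empty word takes i removals.
        current = [i]
        for j, target_char in enumerate(target, start=1):
            replace = previous[j - 1] + (source_char != target_char)
            add = current[j - 1] + 1
            remove = previous[j] + 1
            current.append(min(replace, add, remove))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitten"))     # 1: replace k with s
print(levenshtein("Saturday", "Satuday"))  # 1: remove r
print(levenshtein("kitten", "sitting"))    # 3: two replacements and one addition
```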

Rules determine how many typos a word may contain and still be considered "similar". These rules apply per word:

  • If the search term is 1 to 4 characters long, typos are not allowed.
  • If the search term is 5 to 8 characters long, one typo is allowed.
  • If the search term is more than 8 characters long, up to two typos are allowed.
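
The length rules above can be sketched as a small lookup. This is an illustrative Python helper, not part of any Afosto API; the name `allowed_typos` is hypothetical, and counting length with `len` (Unicode code points) is an assumption.

```python
def allowed_typos(search_term: str) -> int:
    """Return how many typos are tolerated for one search term, based on its length."""
    length = len(search_term)
    if length <= 4:
        return 0  # 1 to 4 characters: no typos
    if length <= 8:
        return 1  # 5 to 8 characters: one typo
    return 2      # more than 8 characters: up to two typos


for term in ("shoe", "shoes", "sneakers", "headphones"):
    print(term, len(term), allowed_typos(term))
```

Because the rules apply per word, a multi-word query would call this helper once for each word rather than once for the whole query.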

For example, the search term "Saturday" is 8 characters long, so documents with at most one typo are accepted:

  • "Saturday" is accepted because it's the same word.
  • "Sat" is not accepted because it doesn't start with the search term; it is only a prefix of it.
  • "Satuday" is accepted because it has one typo.
  • "Saruday" is not accepted because it has two typos.
  • "Sariday" is not accepted because it has three typos.
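
The typo counts in this example can be reproduced with a prefix variant of the edit distance: the search term is compared against every prefix of the document word, and the smallest distance wins. The sketch below is one common way to compute that, written in Python under the assumption that this matches the behavior described here; `prefix_distance` is a hypothetical name, not an Afosto function.

```python
def prefix_distance(search_term: str, document_word: str) -> int:
    """Smallest number of changes between search_term and any prefix of document_word."""
    # previous[j] is the distance between the part of search_term handled so far
    # and the prefix document_word[:j].
    previous = list(range(len(document_word) + 1))
    for i, term_char in enumerate(search_term, start=1):
        current = [i]
        for j, word_char in enumerate(document_word, start=1):
            current.append(min(
                previous[j - 1] + (term_char != word_char),  # replace (or keep)
                current[j - 1] + 1,                          # add a letter
                previous[j] + 1,                             # remove a letter
            ))
        previous = current
    # Taking the minimum over all columns picks the best-matching prefix.
    return min(previous)


for word in ("Saturday", "Saturdays", "Sat", "Satuday", "Saruday", "Sariday"):
    print(word, prefix_distance("Saturday", word))
# Saturday 0, Saturdays 0, Sat 5, Satuday 1, Saruday 2, Sariday 3
```

A document word is then accepted when this distance does not exceed the number of typos allowed for the search term. Note that "Saturdays" scores 0 because it starts with the search term, while "Sat" scores 5 because five letters of the search term are missing.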